Functional Programming

Gerko Vink

Methodology & Statistics @ Utrecht University

9 Jun 2025

Disclaimer

I owe a debt of gratitude to many people, as the thoughts and code in these slides are the product of years-long development cycles and discussions with my team, friends, colleagues and peers. When someone has contributed to the content of the slides, I have credited their authorship.

Images are either directly linked, or generated with StableDiffusion or DALL-E. That said, there is no information in this presentation that exceeds legal use of copyright materials in academic settings, or that should not be part of the public domain.

Warning

You may use any and all content in this presentation - including my name - and submit it as input to generative AI tools, with the following exception:

  • You must ensure that the content is not used for further training of the model

Slide materials and source code

Materials

Recap

Yesterday we covered these topics:

  • Descriptive statistics
  • Cross-tables and frequency distributions
  • \(\chi^2\)-test
  • Other test statistics and measures of association
  • Simple linear regression
  • Running analyses on groups

Today

Today we cover the following topics:

  • Developing, using and debugging your own functions
  • Map / Reduce workflows
  • Binary operators
  • Drawing from distributions
  • Random number generation

Packages we use

library(dplyr)    # data manipulation
library(purrr)    # functional programming
library(furrr)    # parallel processing
library(magrittr) # flexible pipes
library(mice)     # for the boys data

# fix the random seed
set.seed(123)

Writing functions

Write your own function

Functions are created with function()

my_function <- function(arguments) {
  
  expressions       
  
  return(output)   
                  
}
  • arguments are the inputs of the function

  • expressions are the operations performed on the arguments

  • output is an object containing the result (e.g. vector, matrix, list, etc.)

  • return() makes the return of the output explicit (optional, but recommended!)

my_function <- function(arguments) {
  
  expressions       
  
  output # less clear that this is returned
                  
}
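
As a minimal sketch of the two templates above, the functions below return the same value, once with an explicit return() and once by relying on the last evaluated expression. The names square_explicit, square_implicit and safe_sqrt are hypothetical and used only for illustration. The third function shows a case where return() is needed: leaving the function early.

```r
# Hypothetical helpers, for illustration only
square_explicit <- function(x) {
  result <- x^2
  return(result)      # explicit return
}

square_implicit <- function(x) {
  x^2                 # the last evaluated expression is returned
}

safe_sqrt <- function(x) {
  if (x < 0) {
    return(NA_real_)  # early exit: the rest of the body is skipped
  }
  sqrt(x)
}

square_explicit(4)
square_implicit(4)
identical(square_explicit(4), square_implicit(4))
safe_sqrt(-1)
safe_sqrt(9)
```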

Tossing a die

A function without arguments that simulates tossing a die

die <- function() {
  # throw die
  eyes <- sample(1:6, size = 1)
  # return the outcome
  return(eyes)
}
c(die(), die(), die())
[1] 3 6 3

Defining an argument

The argument n specifies the number of throws of the die

dice <- function(n) {
  # n is the number of dice to toss
  # replace = TRUE allows for repeated outcomes
  # returns a vector of length n
  return(sample(1:6, size = n, replace = TRUE))
}

dice(10)
 [1] 2 2 6 3 5 4 6 6 1 2
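
Once a function takes an argument, it can help to check that argument before using it. The sketch below is one possible way to do that with stopifnot(). The name dice_checked is hypothetical, and the specific checks are an assumption about what counts as valid input.

```r
# Hypothetical variant of dice() that validates its argument first
dice_checked <- function(n) {
  # n must be a single number of at least 1
  stopifnot(is.numeric(n), length(n) == 1, n >= 1)
  return(sample(1:6, size = n, replace = TRUE))
}

dice_checked(5)       # positional argument
dice_checked(n = 5)   # named argument, same meaning

# invalid input stops the function; tryCatch() captures the error message
tryCatch(dice_checked("ten"), error = function(e) conditionMessage(e))
```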

Multiple returns

A function dice(n) returning a list with

  • the outcomes of the n throws, their frequencies and their mean
dice <- function(n) {
  # throw dice n times
  eyes <- sample(1:6, size = n, replace = TRUE) 
  # prepare structured output
  return(list(outcomes = eyes,
              freqs    = table(eyes),
              mean     = mean(eyes)))
}
dice(10)
$outcomes
 [1] 3 5 3 3 1 4 1 1 5 3

$freqs
eyes
1 3 4 5 
3 4 1 2 

$mean
[1] 2.9
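
Because the function returns a named list, the individual parts can be extracted by name. The sketch below repeats the dice(n) definition from this slide so that it runs on its own. The object name res is an assumption.

```r
dice <- function(n) {
  eyes <- sample(1:6, size = n, replace = TRUE)
  return(list(outcomes = eyes,
              freqs    = table(eyes),
              mean     = mean(eyes)))
}

set.seed(123)      # fix the seed so the throws are reproducible
res <- dice(10)

res$outcomes       # the vector of throws
res$freqs          # the frequency table
res[["mean"]]      # same as res$mean
names(res)         # the names of the list elements
```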

Default arguments

The default is a fair die (each outcome has probability 1/6)

  • the user can change this if so desired
dice <- function(n, p = rep(1/6, 6)) {
  # throw dice n times with probability p
  eyes <- sample(1:6, size = n, replace = TRUE, prob = p)
  # prepare structured output
  return(list(outcomes  = eyes,
              frequency = table(eyes),
              mean      = mean(eyes)))
}
dice(100)
$outcomes
  [1] 5 5 3 2 1 1 6 6 2 4 6 3 3 3 2 4 4 4 2 2 3 4 3 1 2 4 6 2 5 3 2 6 1 4 5 2 4
 [38] 3 6 4 6 6 6 4 6 5 6 2 4 3 4 5 4 2 3 6 4 6 2 4 1 1 1 3 2 5 4 5 3 3 6 2 4 5
 [75] 5 3 4 1 4 1 1 5 4 2 1 3 2 1 6 2 5 1 5 4 5 3 3 3 4 1

$frequency
eyes
 1  2  3  4  5  6 
14 17 18 22 14 15 

$mean
[1] 3.5

Unfair dice

The following command throws 100 unfair dice

  • the probability of rolling a 1, 2, 3, 4 or 5 is 0.1 each
  • the probability of rolling a 6 is 0.5
c(rep(.1, 5), .5)
[1] 0.1 0.1 0.1 0.1 0.1 0.5
dice(100,  p = c(rep(.1, 5), .5))
$outcomes
  [1] 6 6 6 4 4 2 4 5 3 4 2 5 1 6 6 6 6 6 2 6 6 6 6 5 2 6 6 6 6 6 3 6 6 6 3 6 4
 [38] 6 6 3 5 6 6 6 4 6 2 5 4 4 6 3 2 3 2 6 5 6 3 6 6 3 1 1 6 6 1 4 1 6 6 4 6 3
 [75] 6 1 4 3 6 2 6 6 6 6 6 6 6 4 6 5 6 6 2 1 6 1 5 4 6 6

$frequency
eyes
 1  2  3  4  5  6 
 8  9 10 13  8 52 

$mean
[1] 4.6
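
The sample means above can be compared with the theoretical expected value, which is the sum of each face multiplied by its probability. The sketch below works this out for the fair and the unfair die. The object names are assumptions. The theoretical means are 3.5 and 4.5, so the simulated mean of 4.6 for the unfair die is close to what we would expect.

```r
faces    <- 1:6
p_fair   <- rep(1/6, 6)
p_unfair <- c(rep(.1, 5), .5)

# the probabilities of all faces together sum to 1
sum(p_fair)
sum(p_unfair)

# expected value: sum over faces of (face * probability)
ev_fair   <- sum(faces * p_fair)     # (1 + 2 + ... + 6) / 6
ev_unfair <- sum(faces * p_unfair)   # 0.1 * (1 + ... + 5) + 0.5 * 6
ev_fair
ev_unfair
```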

Applying your function

apply()

The apply() function is used to apply a function to the rows or columns of a matrix or array. It takes three main arguments: the data, the margin (1 for rows, 2 for columns), and the function to apply.

calc_mean <- function(x) {
  return(mean(x, na.rm = TRUE))
}
# select random 10 rows from the numeric columns of boys
numboys <- boys %>% select(where(is.numeric))
which_rows <- sample(1:nrow(numboys), 10)
numboys <- numboys[which_rows, ]
# over the columns
apply(numboys, FUN = calc_mean, MARGIN = 2)
     age      hgt      wgt      bmi       hc       tv 
 15.0746 168.3200  62.7300  21.2590  55.1300  10.5000 
# over the rows
apply(numboys, FUN = calc_mean, MARGIN = 1)
    5975     7062     6131     5897     4505     7073     7088     4487 
58.03867 67.37580 73.66980 68.45680 52.04250 62.35783 71.46680 48.28217 
    6693     2423 
72.43360 35.15900 
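
To see what the margin does, it can help to try apply() on a matrix small enough to check by hand. The sketch below uses a made-up 2 by 3 matrix. The object names are assumptions.

```r
# 2 rows, 3 columns, filled column by column:
#      [,1] [,2] [,3]
# [1,]    1    3    5
# [2,]    2    4    6
m <- matrix(1:6, nrow = 2)

row_means <- apply(m, MARGIN = 1, FUN = mean)  # one value per row
col_means <- apply(m, MARGIN = 2, FUN = mean)  # one value per column
row_means  # 3 4
col_means  # 1.5 3.5 5.5

# for means, base R also offers dedicated functions
rowMeans(m)
colMeans(m)
```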

lapply()

lapply() works like apply(), but on lists rather than on matrices. It applies a function to each element of a list and returns a list of results. A data frame is a list of columns, so lapply() applies the function to each column.

lapply(numboys, FUN = calc_mean)
$age
[1] 15.0746

$hgt
[1] 168.32

$wgt
[1] 62.73

$bmi
[1] 21.259

$hc
[1] 55.13

$tv
[1] 10.5

sapply()

sapply() does the same as lapply(), but it simplifies the output to a vector or matrix if possible. It is useful when you want to avoid dealing with lists.

sapply(numboys, FUN = calc_mean)
     age      hgt      wgt      bmi       hc       tv 
 15.0746 168.3200  62.7300  21.2590  55.1300  10.5000 
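
The difference between lapply() and sapply() is only in the shape of the result. The sketch below shows this on a small made-up list. It also shows vapply(), a base R relative of sapply() that is not covered on these slides and that requires you to state the type and length of each result. The object names are assumptions.

```r
calc_mean <- function(x) {
  return(mean(x, na.rm = TRUE))
}

# made-up example data
scores <- list(a = c(1, 2, 3), b = c(10, NA, 30), c = c(5, 5))

as_list   <- lapply(scores, FUN = calc_mean)   # always a list
as_vector <- sapply(scores, FUN = calc_mean)   # simplified to a named vector
as_strict <- vapply(scores, FUN = calc_mean,
                    FUN.VALUE = numeric(1))    # errors if a result is not one number

str(as_list)
as_vector
as_strict
```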

tapply()

tapply() is used to apply a function to subsets of a vector, grouped by one or more factors. It is particularly useful for summarizing data based on grouping variables.

tapply(boys$hgt, boys$reg, FUN = calc_mean)
   north     east     west    south     city 
151.6316 133.9648 130.2783 128.0022 125.8577 
boys %>% 
  group_by(reg) %>% 
  summarise(mean_hgt = mean(hgt, na.rm = TRUE))
# A tibble: 6 × 2
  reg   mean_hgt
  <fct>    <dbl>
1 north    152. 
2 east     134. 
3 west     130. 
4 south    128. 
5 city     126. 
6 <NA>      73.0
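
Note that the tapply() result above has five groups, while the group_by() result has a sixth row for the boys whose region is missing. tapply() drops observations with a missing grouping value. The sketch below reproduces this on made-up data and shows one way to keep the missing group, using addNA(). The object names and values are assumptions.

```r
# made-up heights and regions; the last region is missing
hgt <- c(150, 160, 170, 180, 90)
reg <- factor(c("north", "north", "east", "east", NA))

# observations with a missing region are dropped
by_reg <- tapply(hgt, reg, FUN = mean)
by_reg

# addNA() turns NA into a factor level of its own, so the group is kept
by_reg_na <- tapply(hgt, addNA(reg), FUN = mean)
by_reg_na
```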

Map / Reduce

map()

The map() function is part of the purrr package, which is designed for functional programming in R. It allows you to apply a function to each element of a list or vector, returning a list of results.

boys %>% 
  split(.$reg) %>% # split the data by region
  map(~ lm(hgt ~ age, data = .x) %>% # map the linear model function
        coef()) # extract coefficients
$north
(Intercept)         age 
  74.104664    6.376882 

$east
(Intercept)         age 
  73.229714    6.507535 

$west
(Intercept)         age 
  69.446550    6.646496 

$south
(Intercept)         age 
  70.410123    6.566541 

$city
(Intercept)         age 
  69.010565    6.724608 
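
The same split-then-map pattern can be tried on any data frame. The sketch below uses the built-in mtcars data instead of boys, so it runs with only purrr loaded. The choice of mtcars, the model mpg ~ wt and the object names are assumptions made for illustration. In the formula shorthand, ~ starts an anonymous function and .x refers to the current list element. map_dbl() is the purrr variant that returns a numeric vector instead of a list.

```r
library(purrr)

# split the data into a list of data frames, one per number of cylinders
by_cyl <- split(mtcars, mtcars$cyl)

# map() returns a list: one coefficient vector per group
coefs <- map(by_cyl, ~ coef(lm(mpg ~ wt, data = .x)))

# map_dbl() returns a named numeric vector: one slope per group
slopes <- map_dbl(by_cyl, ~ coef(lm(mpg ~ wt, data = .x))[["wt"]])

# the same call written with an ordinary anonymous function
slopes_fn <- map_dbl(by_cyl, function(d) coef(lm(mpg ~ wt, data = d))[["wt"]])

coefs
slopes
```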

split()

boys %>% 
  split(.$reg)
$north
        age   hgt     wgt   bmi   hc  gen  phb tv   reg
127   0.093  56.0   5.410 17.25 40.0 <NA> <NA> NA north
198   0.117  57.0   5.260 16.18 40.0 <NA> <NA> NA north
238   0.142  58.0   5.220 15.51 40.1 <NA> <NA> NA north
248   0.147  57.3   4.950 15.07 36.8 <NA> <NA> NA north
873   0.594  70.8   8.970 17.89 45.2 <NA> <NA> NA north
911   0.673  71.0   9.000 17.85 46.5 <NA> <NA> NA north
1212  0.996  77.1  10.390 17.47 47.1 <NA> <NA> NA north
1278  1.040  77.5   9.300 15.48 46.3 <NA> <NA> NA north
1511  1.292  79.0  10.700 17.14 47.3 <NA> <NA> NA north
1617  1.481    NA  12.040    NA 47.5 <NA> <NA> NA north
1684  1.530  80.0  10.785 16.85 46.1 <NA> <NA> NA north
1877  1.793  86.0  13.400 18.11 47.0 <NA> <NA> NA north
1882  1.798  81.8  10.535 15.74 46.7 <NA> <NA> NA north
1927  1.848    NA  13.200    NA 50.0 <NA> <NA> NA north
2044  2.020  88.3  13.000 16.67 50.0 <NA> <NA> NA north
2168  2.198  94.2  14.980 16.88 51.0 <NA> <NA> NA north
2313  2.576  86.0  10.700 14.46 49.5 <NA> <NA> NA north
2448  2.929  95.3  13.500 14.86 51.2 <NA> <NA> NA north
2583  3.184 100.0  15.500 15.50 50.3 <NA> <NA> NA north
2609  3.304 108.0  17.500 15.00 53.4 <NA> <NA> NA north
2829  4.197 108.0  18.000 15.43 51.0 <NA> <NA> NA north
2977  5.782 124.8  25.100 16.11 53.8 <NA> <NA> NA north
3145  7.540 138.0  31.000 16.27 50.5 <NA> <NA> NA north
3279  8.859 124.8  31.000 19.90 51.6   G1   P1  1 north
3357  9.119 140.0  28.000 14.28 52.1   G1   P1  2 north
3416  9.303 142.2  31.600 15.62 51.5   G1   P1  3 north
4009 11.011 148.8  44.200 19.96 55.4   G1   P2  2 north
4102 11.222 151.8  44.400 19.26 55.6   G1   P1  2 north
4253 11.655 160.6  44.400 17.21 57.6   G2   P1  2 north
4280 11.728 149.6  35.500 15.86 53.8   G1 <NA>  1 north
4532 12.457 161.4  52.600 20.19 53.0   G2   P2  3 north
4750 12.996 171.5  49.700 16.89 55.6 <NA> <NA> NA north
4848 13.190 155.4  42.100 17.43 53.9   G3   P3  6 north
5023 13.604 165.0  53.500 19.65 57.7 <NA> <NA> NA north
5113 13.839 162.1  44.900 17.08 52.5   G4   P3  8 north
5130 13.891 174.6  54.200 17.77 55.1   G4   P4 10 north
5147 13.924 144.8  35.100 16.74 54.2   G3   P4  8 north
5219 14.069 170.6  59.200 20.34 55.1   G3   P3  8 north
5237 14.099 163.2  59.500 22.33 53.7 <NA> <NA> NA north
5400 14.494 147.4  39.000 17.95 54.7 <NA> <NA> NA north
5416 14.543 173.7  69.500 23.03 57.3   G5   P5 12 north
5520 14.811 173.8  61.700 20.42 55.0   G4   P4 15 north
5554 14.877 188.5  59.300 16.68 53.0 <NA> <NA> NA north
5598 14.997 181.2  65.100 19.82 55.6   G4   P5 10 north
5610 15.025 185.5  62.700 18.22 56.4   G3   P4  8 north
5715 15.268 179.5  57.000 17.69   NA <NA> <NA> NA north
5764 15.416 187.2  80.600 22.99 59.6   G4   P4 17 north
5789 15.474 192.2  80.200 21.71 59.3   G4   P5 20 north
5807 15.507 182.1  63.000 18.99 55.4 <NA> <NA> NA north
5823 15.542 171.0  50.000 17.09 56.6   G2   P3  6 north
5830 15.556 183.3  61.500 18.30 53.6   G5   P5 15 north
5897 15.704 182.8  67.100 20.08 56.6 <NA> <NA> NA north
5975 15.912 180.0  65.200 20.12 55.0   G4   P5 12 north
5986 15.926 167.8  62.200 22.09 55.4   G4   P5 12 north
6005 15.989 187.8  64.800 18.37 56.8   G5   P6 13 north
6064 16.156 194.3 113.000 29.93 58.4   G3   P5  6 north
6067 16.175 193.5  70.000 18.69   NA <NA> <NA> NA north
6131 16.399 184.4  84.500 24.85 58.2 <NA> <NA> NA north
6209 16.605 183.1  70.200 20.93 55.8 <NA> <NA> NA north
6262 16.741 189.8  70.300 19.51 59.2   G4   P5 20 north
6311 16.900 190.2  69.400 19.18 55.7 <NA> <NA> NA north
6480 17.327 170.5  75.000 25.79   NA <NA> <NA> NA north
6522 17.426 183.2  57.900 17.25   NA <NA> <NA> NA north
6539 17.467 173.6  55.500 18.41 55.7   G4   P4 10 north
6611 17.678 176.4  72.800 23.39 56.6   G5   P5 14 north
6617 17.694 195.7  80.000 20.88 59.4 <NA> <NA> NA north
6686 17.911 181.2  86.800 26.43 58.3   G5   P5 15 north
6844 18.406 180.5  81.000 24.86 55.8 <NA> <NA> NA north
6878 18.507 193.5  92.000 24.57 59.0 <NA> <NA> NA north
6943 18.677 196.7  75.000 19.38 58.6 <NA> <NA> NA north
7023 18.929 188.8  75.800 21.26 57.3 <NA> <NA> NA north
7032 18.959 185.1  76.200 22.24 56.1   G5   P6 16 north
7047 19.011 180.3  77.500 23.84 58.5 <NA> <NA> NA north
7055 19.028 187.1  77.800 22.22 57.9 <NA> <NA> NA north
7173 19.471 191.0  87.100 23.87 58.4   G4   P6 15 north
7184 19.501 186.8  78.500 22.49 57.9 <NA> <NA> NA north
7293 19.926 192.3 117.400 31.74 57.6   G5   P6 18 north
7329 20.032 184.0  73.000 21.56 56.0 <NA> <NA> NA north
7405 20.323 182.5  69.000 20.71 59.0 <NA> <NA> NA north
7418 20.429 181.1  67.200 20.48 56.6 <NA> <NA> NA north
7451 20.813 189.0  78.000 21.83 59.9 <NA> <NA> NA north

$east
        age   hgt     wgt   bmi   hc  gen  phb tv  reg
43    0.073  54.5   4.200 14.14 38.0 <NA> <NA> NA east
60    0.079  55.0   4.430 14.64 38.0 <NA> <NA> NA east
113   0.090  55.7   4.070 13.11 36.6 <NA> <NA> NA east
167   0.101  55.8   5.060 16.25 38.5 <NA> <NA> NA east
240   0.142  57.3   5.090 15.50 39.7 <NA> <NA> NA east
399   0.188  58.5   6.030 17.61 41.5 <NA> <NA> NA east
434   0.199  61.0   5.830 15.66 39.9 <NA> <NA> NA east
463   0.224  62.5   6.310 16.15 40.3 <NA> <NA> NA east
495   0.246  60.4   5.900 16.17 40.9 <NA> <NA> NA east
514   0.251  63.2   5.670 14.19 42.2 <NA> <NA> NA east
543   0.257  61.6   6.920 18.23 39.6 <NA> <NA> NA east
577   0.262  61.0   6.200 16.66 40.6 <NA> <NA> NA east
627   0.279  58.2   5.050 14.90 41.0 <NA> <NA> NA east
770   0.495  68.0   8.600 18.59 46.0 <NA> <NA> NA east
795   0.511  67.1   7.100 15.76 42.5 <NA> <NA> NA east
1001  0.758  74.0  10.090 18.42 47.2 <NA> <NA> NA east
1006  0.761  71.0   9.020 17.89 46.3 <NA> <NA> NA east
1020  0.769  69.0   9.400 19.74 46.6 <NA> <NA> NA east
1063  0.807  74.2   9.540 17.32 46.4 <NA> <NA> NA east
1065  0.807  76.2   9.960 17.15 45.8 <NA> <NA> NA east
1122  0.919  75.0   9.840 17.49 46.8 <NA> <NA> NA east
1150  0.950  74.0   9.575 17.48 48.0 <NA> <NA> NA east
1232  1.007  74.0   9.250 16.89 47.5 <NA> <NA> NA east
1253  1.021  68.0   6.945 15.01 46.3 <NA> <NA> NA east
1301  1.062  81.2  11.940 18.10 46.6 <NA> <NA> NA east
1349  1.171  77.0  10.910 18.40 45.9 <NA> <NA> NA east
1439  1.229  82.9  10.860 15.80 48.5 <NA> <NA> NA east
1468  1.245  81.0  12.000 18.28 48.2 <NA> <NA> NA east
1473  1.248  82.7  12.100 17.69 48.3 <NA> <NA> NA east
1489  1.262  81.1  10.720 16.29 49.2 <NA> <NA> NA east
1507  1.284  82.0   9.980 14.84 48.0 <NA> <NA> NA east
1519  1.300  80.0  10.440 16.31 48.7 <NA> <NA> NA east
1618  1.481  78.7  10.385 16.76 47.6 <NA> <NA> NA east
1628  1.492  84.0  11.900 16.86 47.9 <NA> <NA> NA east
1681  1.527  86.5  12.490 16.69 50.7 <NA> <NA> NA east
1840  1.752  81.5   9.700 14.60 47.0 <NA> <NA> NA east
1876  1.790  90.2  14.500 17.82 51.2 <NA> <NA> NA east
1991  1.960    NA  11.000    NA 49.7 <NA> <NA> NA east
2018  1.990    NA  12.900    NA 48.7 <NA> <NA> NA east
2020  1.993  93.1  13.800 15.92 49.3 <NA> <NA> NA east
2162  2.190  96.5  13.800 14.81 48.7 <NA> <NA> NA east
2174  2.209  91.7  13.700 16.29 48.1 <NA> <NA> NA east
2269  2.491  88.5  11.200 14.29 48.8 <NA> <NA> NA east
2280  2.507  97.2  16.900 17.88 52.2 <NA> <NA> NA east
2289  2.524 100.4  19.400 19.24 53.2 <NA> <NA> NA east
2366  2.715  91.0  13.800 16.66 51.0 <NA> <NA> NA east
2414  2.866  94.0  16.000 18.10 50.0 <NA> <NA> NA east
2423  2.885  91.4  14.800 17.71 49.0 <NA> <NA> NA east
2476  2.981 100.5  15.200 15.04 50.2 <NA> <NA> NA east
2535  3.060 102.3  16.600 15.86 50.4 <NA> <NA> NA east
2540  3.069 104.9  17.600 15.99 51.0 <NA> <NA> NA east
2667  3.720 105.8  19.000 16.97 51.7 <NA> <NA> NA east
2673  3.731 104.0  16.900 15.62 53.1 <NA> <NA> NA east
2678  3.745 106.0  16.000 14.23 53.5 <NA> <NA> NA east
2689  3.761 111.1  19.000 15.39 51.8 <NA> <NA> NA east
2740  3.890 104.2  17.100 15.74 51.2 <NA> <NA> NA east
2874  5.196 111.1  20.900 16.93 54.0 <NA> <NA> NA east
2883  5.286 128.7  30.600 18.47 55.0 <NA> <NA> NA east
2957  5.626 119.1  23.600 16.63 53.8 <NA> <NA> NA east
2959  5.637 128.7  28.400 17.14 54.2 <NA> <NA> NA east
2985  5.837 117.3  20.300 14.75 54.4 <NA> <NA> NA east
3083  7.170 133.4  27.500 15.45 51.0 <NA> <NA> NA east
3088  7.252 118.8  21.100 14.95 49.2 <NA> <NA> NA east
3106  7.389 127.5  25.900 15.93 52.3 <NA> <NA> NA east
3131  7.501 132.5  26.500 15.09 52.5 <NA> <NA> NA east
3191  7.813 140.2  30.900 15.72 56.5 <NA> <NA> NA east
3277  8.832 134.5  31.200 17.24 53.4 <NA> <NA> NA east
3321  8.999 136.3  26.900 14.47 53.4   G1   P1  4 east
3327  9.021 141.4  29.400 14.70 53.9   G1   P1  2 east
3455  9.451 136.0  27.500 14.86 51.4   G1   P1  2 east
3481  9.511 144.5  30.300 14.51 52.6   G1   P2 12 east
3484  9.514 140.3  27.800 14.12 53.1   G1   P1  2 east
3494  9.524 140.9  32.700 16.47 53.4   G1   P1  3 east
3518  9.577 141.1  35.300 17.73 53.2 <NA> <NA> NA east
3671 10.028 148.5  47.700 21.63 56.0 <NA> <NA> NA east
3698 10.108 148.1  31.600 14.40 54.8 <NA> <NA> NA east
3723 10.160 148.4  34.200 15.52 54.7 <NA> <NA> NA east
3805 10.398 149.8  34.700 15.46 51.0   G2   P2 10 east
3827 10.447 148.7  41.000 18.54 55.7   G1   P1  2 east
3850 10.513 142.6  31.800 15.63 52.0   G2   P2 NA east
3865 10.554 146.3  40.400 18.87 53.0   G1   P1  2 east
3873 10.568 151.0  36.600 16.05 54.4   G1   P1  4 east
3880 10.581 141.2  33.800 16.95 53.7   G1   P1  3 east
3929 10.724 144.1  29.500 14.20 50.6   G1   P1  4 east
3969 10.880 148.6  36.400 16.48 54.4 <NA>   P1 NA east
3991 10.954 145.1  36.200 17.19 52.7   G2   P1 10 east
3995 10.970 151.2  39.200 17.14 53.4   G1   P2  3 east
4059 11.126 139.6  32.700 16.77 54.3   G2   P2  4 east
4173 11.446 147.9  42.200 19.29 51.4   G1   P2  3 east
4186 11.482 148.7  37.200 16.82 52.9   G1   P1  3 east
4191 11.496 156.9  39.400 16.00 53.9 <NA> <NA> NA east
4210 11.545 160.2  40.800 15.89 54.8 <NA> <NA> NA east
4227 11.583 154.0  39.800 16.78 54.9 <NA> <NA> NA east
4254 11.665 152.5  38.000 16.33   NA <NA> <NA> NA east
4273 11.709 144.7  35.200 16.81 52.3 <NA> <NA> NA east
4312 11.811 145.9  34.800 16.34 51.8   G2   P2  8 east
4457 12.232 155.5  46.300 19.14 57.0 <NA> <NA> NA east
4579 12.574 163.8  51.600 19.23 56.7   G2   P3 12 east
4745 12.991 148.7  31.100 14.06 51.0   G2   P2 15 east
4748 12.993 155.9  42.300 17.40 54.3   G4   P3  6 east
4755 13.002 156.2  34.000 13.93   NA <NA> <NA> NA east
4809 13.108 164.0  61.700 22.94 55.3   G1   P1 10 east
4824 13.127 180.0  57.800 17.83 55.5   G2   P4  8 east
4840 13.171 159.3  44.000 17.33 52.5 <NA> <NA> NA east
4847 13.188 168.1  53.400 18.89 57.3   G3   P2 15 east
4892 13.300 165.5  41.900 15.29 55.6   G2   P2  5 east
4898 13.314 165.1  42.500 15.59 53.6 <NA> <NA> NA east
4961 13.489 161.3  41.400 15.91 56.3   G3   P3 12 east
4968 13.511 178.7  56.800 17.78 51.9 <NA> <NA> NA east
4972 13.519 157.5  43.900 17.69   NA <NA> <NA> NA east
5039 13.631 179.0  54.900 17.13 53.5   G3   P4 20 east
5064 13.686 168.7  46.100 16.19 55.4   G4   P4 10 east
5085 13.749 155.5  36.500 15.09 52.9   G2   P1 12 east
5098 13.790 183.4  70.700 21.01 53.4 <NA> <NA> NA east
5186 14.001 178.6  58.600 18.37 58.9 <NA> <NA> NA east
5228 14.083 172.1  50.900 17.18 56.4   G4   P5  8 east
5293 14.220 165.5  48.000 17.52 53.0   G5   P5 20 east
5326 14.297 167.4  50.000 17.84 55.1 <NA> <NA> NA east
5471 14.666 159.8  49.000 19.18   NA <NA> <NA> NA east
5480 14.677 180.3  67.500 20.76   NA <NA> <NA> NA east
5509 14.762 168.6  47.600 16.74 52.0   G4   P4 15 east
5522 14.811 179.0  66.500 20.75 58.1   G4   P5 25 east
5571 14.934 156.0  60.500 24.86 56.9 <NA> <NA> NA east
5585 14.967 174.1  88.000 29.03 54.4   G3   P4 10 east
5594 14.984 165.8  52.300 19.02 54.6 <NA> <NA> NA east
5642 15.099 178.1  74.500 23.48 58.3   G3   P4 12 east
5665 15.145 177.4  57.500 18.27 55.3 <NA> <NA> NA east
5675 15.162 176.8  54.800 17.53 54.3   G4   P4 15 east
5692 15.200 175.4  54.200 17.61 56.3 <NA> <NA> NA east
5714 15.266 175.2  62.500 20.36 55.2   G4   P4 15 east
5733 15.323 180.1  59.000 18.18 57.4 <NA> <NA> NA east
5763 15.411 173.3  54.100 18.01 58.0   G4   P4 25 east
5769 15.427 172.8  56.700 18.98 56.0 <NA> <NA> NA east
5775 15.444 181.1  77.400 23.59 58.8 <NA> <NA> NA east
5858 15.633 186.0  67.000 19.36 53.5   G5   P6 20 east
5958 15.882 175.9  62.900 20.32   NA <NA> <NA> NA east
6023 16.030 178.9  61.900 19.34 56.5 <NA> <NA> NA east
6029 16.049 186.7  70.600 20.25 56.3   G5   P5 20 east
6249 16.714 181.1  59.300 18.08 56.7 <NA> <NA> NA east
6361 16.999 179.0  68.100 21.25 56.0   G5   P6 25 east
6482 17.333 180.3  76.800 23.62 60.5   G5   P6 20 east
6483 17.336 183.9  66.300 19.60 56.9   G4   P4 20 east
6486 17.347 180.5  76.000 23.32 56.5 <NA> <NA> NA east
6502 17.374 174.5  70.100 23.02 58.7 <NA> <NA> NA east
6528 17.440 171.4  71.700 24.40 58.7   G4   P5 20 east
6547 17.483 182.4  71.000 21.34   NA <NA> <NA> NA east
6569 17.571 187.1  78.500 22.42   NA <NA> <NA> NA east
6633 17.730 172.5  70.000 23.52 57.5 <NA> <NA> NA east
6685 17.911 183.6  61.000 18.09 56.8 <NA> <NA> NA east
6831 18.349 193.6  69.200 18.46 58.8   G5   P6 25 east
6937 18.655 189.0  60.000 16.79 56.0 <NA> <NA> NA east
6963 18.737 191.0  81.300 22.28 60.0   G5   P6 25 east
6964 18.743 192.0  99.000 26.85 59.5   G5   P6 25 east
7083 19.099 198.0  80.000 20.40 57.0 <NA> <NA> NA east
7161 19.408 192.7 100.100 26.95 58.2   G5   P5 25 east
7234 19.671 179.0  78.000 24.34 57.0 <NA> <NA> NA east
7240 19.707 172.5  70.600 23.72 55.9   G5   P6 20 east
7297 19.934 181.8  76.200 23.05 58.8   G5   P6 20 east
7319 20.010 170.0  68.800 23.80 55.5   G5   P6 25 east
7362 20.117 188.7  89.400 25.10 58.1   G5   P6 25 east
7475 21.177 181.8  76.500 23.14   NA <NA> <NA> NA east

$west
        age   hgt    wgt   bmi   hc  gen  phb tv  reg
38    0.071  53.0   3.89 13.84 35.8 <NA> <NA> NA west
39    0.071  55.1   3.88 12.77 36.8 <NA> <NA> NA west
53    0.076  58.5   5.92 17.29 40.5 <NA> <NA> NA west
75    0.082  59.5   5.10 14.40 39.1 <NA> <NA> NA west
99    0.087  59.5   5.13 14.49 38.0 <NA> <NA> NA west
103   0.087    NA   4.54    NA 39.0 <NA> <NA> NA west
138   0.093  56.0   4.80 15.30 38.0 <NA> <NA> NA west
140   0.093  53.5   4.15 14.49 39.0 <NA> <NA> NA west
159   0.098  55.4   4.35 14.17 38.1 <NA> <NA> NA west
184   0.109  58.0   5.52 16.40 38.2 <NA> <NA> NA west
189   0.112  59.0   4.82 13.84 38.0 <NA> <NA> NA west
207   0.123  62.0   5.42 14.09 40.5 <NA> <NA> NA west
226   0.134  58.5   4.95 14.46 40.0 <NA> <NA> NA west
253   0.147  58.0   5.12 15.21 38.5 <NA> <NA> NA west
258   0.150  56.0   5.26 16.77 40.5 <NA> <NA> NA west
270   0.153  57.5   4.95 14.97 39.2 <NA> <NA> NA west
284   0.158  60.5   5.75 15.70 39.5 <NA> <NA> NA west
310   0.164  55.0   3.73 12.33 37.8 <NA> <NA> NA west
317   0.167  59.0   5.08 14.59 39.8 <NA> <NA> NA west
336   0.172  59.0   5.31 15.25 38.0 <NA> <NA> NA west
345   0.172  60.0   5.09 14.13 39.0 <NA> <NA> NA west
365   0.177  61.5   6.68 17.66 42.0 <NA> <NA> NA west
366   0.177  57.5     NA    NA 40.4 <NA> <NA> NA west
416   0.191  61.8   5.56 14.55 39.9 <NA> <NA> NA west
438   0.202  60.0   4.79 13.30 38.2 <NA> <NA> NA west
489   0.243  62.5   5.58 14.28 42.0 <NA> <NA> NA west
553   0.257  63.0   7.38 18.59 41.5 <NA> <NA> NA west
555   0.257  60.0   5.74 15.94 39.5 <NA> <NA> NA west
571   0.260  58.0   4.93 14.65 38.8 <NA> <NA> NA west
599   0.268  59.5   5.53 15.62 41.0 <NA> <NA> NA west
607   0.271  61.5   6.17 16.31 41.0 <NA> <NA> NA west
623   0.276  63.0   6.14 15.46 41.0 <NA> <NA> NA west
654   0.314  63.0   7.38 18.59 41.8 <NA> <NA> NA west
670   0.339  66.7   7.09 15.93 43.0 <NA> <NA> NA west
707   0.405  67.5   7.66 16.81 44.4 <NA> <NA> NA west
755   0.481  67.5   8.65 18.98 44.8 <NA> <NA> NA west
798   0.511  69.5   8.30 17.18 44.5 <NA> <NA> NA west
804   0.514  68.5   7.03 14.98 43.2 <NA> <NA> NA west
818   0.522  67.8   7.81 16.98 43.0 <NA> <NA> NA west
856   0.561  69.5   8.86 18.34 45.5 <NA> <NA> NA west
890   0.626  70.0   9.23 18.83 46.5 <NA> <NA> NA west
902   0.654  71.0   7.96 15.79 45.0 <NA> <NA> NA west
936   0.717  67.5   8.14 17.86 45.2 <NA> <NA> NA west
964   0.739  74.0   8.64 15.77 44.1 <NA> <NA> NA west
1054  0.796  69.8   7.59 15.57 47.5 <NA> <NA> NA west
1062  0.804  74.0   9.05 16.52 45.0 <NA> <NA> NA west
1066  0.810  75.0   9.62 17.10 46.5 <NA> <NA> NA west
1079  0.829  64.0   6.90 16.84 44.0 <NA> <NA> NA west
1120  0.917  74.5  10.39 18.71 44.5 <NA> <NA> NA west
1123  0.919  78.5  11.70 18.98 47.7 <NA> <NA> NA west
1170  0.960  73.5   8.74 16.17 44.5 <NA> <NA> NA west
1177  0.969  84.0  11.47 16.25 49.0 <NA> <NA> NA west
1226  1.004  77.5  10.11 16.83 46.0 <NA> <NA> NA west
1257  1.021  77.5  10.70 17.81 46.5 <NA> <NA> NA west
1285  1.048  79.5  11.27 17.83 49.0 <NA> <NA> NA west
1314  1.086  74.5   8.92 16.07 45.1 <NA> <NA> NA west
1372  1.188  81.0  13.19 20.10 49.9 <NA> <NA> NA west
1414  1.212  80.0  11.08 17.31 49.3 <NA> <NA> NA west
1415  1.212  79.0  10.22 16.37 48.1 <NA> <NA> NA west
1522  1.305  82.5  11.10 16.30 50.5 <NA> <NA> NA west
1620  1.486  80.5  11.70 18.05 48.0 <NA> <NA> NA west
1632  1.494    NA  11.50    NA 47.0 <NA> <NA> NA west
1661  1.514  84.5  12.30 17.22 50.0 <NA> <NA> NA west
1664  1.514  80.0  10.10 15.78 47.0 <NA> <NA> NA west
1676  1.522  77.5   9.20 15.31 48.5 <NA> <NA> NA west
1687  1.530    NA  12.40    NA 47.7 <NA> <NA> NA west
1695  1.535  81.0   9.85 15.01 47.7 <NA> <NA> NA west
1710  1.552  85.0  11.97 16.56 50.0 <NA> <NA> NA west
1732  1.579  85.5  11.30 15.45 46.3 <NA> <NA> NA west
1737  1.582  85.2  14.20 19.56 48.6 <NA> <NA> NA west
1739  1.585    NA  11.70    NA 48.0 <NA> <NA> NA west
1792  1.675    NA  11.00    NA 47.2 <NA> <NA> NA west
1903  1.826  83.7  10.61 15.14 46.5 <NA> <NA> NA west
1947  1.878  87.5  11.30 14.75 49.5 <NA> <NA> NA west
1983  1.941  87.0  12.90 17.04 50.5 <NA> <NA> NA west
1990  1.957    NA  13.40    NA 50.0 <NA> <NA> NA west
2071  2.045  92.2  14.60 17.17 49.3 <NA> <NA> NA west
2122  2.127  90.5  12.00 14.65 49.1 <NA> <NA> NA west
2125  2.132  90.2  13.00 15.97 49.5 <NA> <NA> NA west
2148  2.173  91.2  14.70 17.67 51.1 <NA> <NA> NA west
2186  2.264  89.0  13.90 17.54 51.4 <NA> <NA> NA west
2242  2.455  92.5  15.20 17.76 50.0 <NA> <NA> NA west
2306  2.559  95.0  12.50 13.85 51.4 <NA> <NA> NA west
2337  2.622 100.0  15.80 15.80 52.0 <NA> <NA> NA west
2338  2.625  99.0  16.50 16.83 53.0 <NA> <NA> NA west
2457  2.956 102.0  18.40 17.68   NA <NA> <NA> NA west
2484  2.989  95.5  15.70 17.21 49.7 <NA> <NA> NA west
2531  3.058  92.6  12.80 14.92 49.3 <NA> <NA> NA west
2538  3.066  97.1  15.40 16.33 50.0 <NA> <NA> NA west
2616  3.342 100.5  18.10 17.92 51.3 <NA> <NA> NA west
2626  3.408  99.5  15.00 15.15 52.2 <NA> <NA> NA west
2651  3.583 101.0  15.90 15.58 49.0 <NA> <NA> NA west
2663  3.701 105.1  17.80 16.11 52.1 <NA> <NA> NA west
2664  3.707  98.0  14.90 15.51 49.7 <NA> <NA> NA west
2685  3.750 106.0  17.70 15.75 50.5 <NA> <NA> NA west
2772  3.956 107.0  18.30 15.98 53.4 <NA> <NA> NA west
2932  5.533 116.3  20.20 14.93 50.0 <NA> <NA> NA west
2978  5.782 119.6  21.50 15.03 53.0 <NA> <NA> NA west
2984  5.820    NA     NA    NA   NA <NA> <NA> NA west
2996  5.935 119.1  19.50 13.74 51.6 <NA> <NA> NA west
3018  6.113 124.6  27.00 17.39 51.5 <NA> <NA> NA west
3058  6.819 121.1  21.50 14.66 51.5 <NA> <NA> NA west
3060  6.841 125.7  31.20 19.74 54.1 <NA> <NA> NA west
3165  7.613 122.5  24.50 16.32 50.0 <NA> <NA> NA west
3199  7.885 133.7  29.40 16.44 53.6 <NA> <NA> NA west
3209  7.964 132.8  27.00 15.30 49.5 <NA> <NA> NA west
3323  9.004 151.2  48.20 21.08 50.5   G2   P1  2 west
3388  9.201 125.8  22.00 13.90 54.4   G1   P1  3 west
3409  9.270 140.4  32.00 16.23 52.3   G2   P1  1 west
3449  9.426 146.0  36.50 17.12 51.4   G1   P1  3 west
3453  9.440 144.9  36.50 17.38 53.0   G1   P2 NA west
3460  9.459 142.7  30.80 15.12 51.5   G1   P2  4 west
3486  9.514 138.0  31.00 16.27 50.4   G1   P1  2 west
3533  9.604 139.7  32.60 16.70 53.0   G1   P1  4 west
3609  9.834 142.0  30.30 15.02 53.4   G1   P1  3 west
3624  9.869 141.8  33.90 16.85 53.4 <NA> <NA> NA west
3641  9.946 146.5  38.60 17.98 53.6 <NA> <NA> NA west
3651  9.990 149.0  37.30 16.80 53.7   G1   P1  2 west
3727 10.171 135.2  31.90 17.45 53.0   G2   P1  3 west
3792 10.360 145.7  40.40 19.03 56.0 <NA> <NA> NA west
3814 10.422 158.8  39.80 15.78 54.2   G2   P1  4 west
3847 10.510 152.5  38.10 16.38 52.7 <NA> <NA> NA west
3994 10.967 137.4  29.60 15.67 50.5   G2   P1  9 west
4000 10.992 148.5  35.30 16.00 52.7 <NA> <NA> NA west
4006 11.003 134.3  29.10 16.13 53.5   G1   P1  1 west
4066 11.143 135.1  25.00 13.69 48.2   G1   P2  2 west
4103 11.225 154.0  37.00 15.60 52.5 <NA> <NA> NA west
4122 11.288 159.4  43.40 17.08 54.3   G1   P1  4 west
4174 11.446 152.8  43.10 18.45 55.1   G1   P1  3 west
4240 11.611 151.0  33.80 14.82 55.0   G1   P1  5 west
4266 11.690 148.0  35.20 16.07 50.9   G1   P1  2 west
4268 11.696    NA     NA    NA 54.2   G1   P1  3 west
4318 11.827 151.0  33.00 14.47 53.0   G1   P1 15 west
4332 11.874 152.0  32.50 14.06 52.0   G1   P2  6 west
4349 11.926 156.5  44.50 18.16 53.0   G1   P1  6 west
4399 12.071 151.1  34.50 15.11 55.0   G1   P1  2 west
4465 12.265 156.6  43.30 17.65 58.1   G3   P3 15 west
4481 12.292 145.8  39.20 18.44 54.1   G1   P1  3 west
4487 12.303 161.4  44.00 16.89 53.1   G2   P2  2 west
4501 12.342 158.1  54.90 21.96   NA <NA> <NA> NA west
4506 12.377 149.3  38.00 17.04   NA <NA> <NA> NA west
4569 12.542 160.5  49.10 19.06 54.4 <NA> <NA> NA west
4591 12.599 155.0  39.00 16.23 52.5   G2   P2  5 west
4604 12.629 163.2  46.60 17.49 56.2   G3   P3 NA west
4651 12.752 150.7  40.90 18.00 52.0 <NA> <NA> NA west
4682 12.821 170.2  56.00 19.33 57.5   G4   P4 12 west
4823 13.127 175.0  65.10 21.25 58.1   G2   P2  6 west
4825 13.130 156.4  40.80 16.67 55.1   G2   P2 10 west
4916 13.374 160.0  48.10 18.78 55.0 <NA> <NA> NA west
4940 13.445 167.8  51.00 18.11   NA <NA> <NA> NA west
4943 13.453 156.7  39.10 15.92 53.0 <NA> <NA> NA west
4944 13.459 185.7  77.50 22.47   NA <NA> <NA> NA west
4994 13.552 157.7  46.20 18.57 54.2   G2   P2  3 west
5026 13.615 158.9  45.00 17.82   NA <NA> <NA> NA west
5073 13.719 167.4  53.10 18.94 57.6 <NA> <NA> NA west
5115 13.842 160.5  52.40 20.34 57.5 <NA> <NA> NA west
5133 13.897 181.7  61.90 18.74 54.5   G4   P4 20 west
5151 13.927 176.6  54.50 17.47 54.0 <NA> <NA> NA west
5159 13.938 156.9  50.00 20.31 54.5   G2   P2 10 west
5165 13.954 175.4  58.40 18.98 55.0 <NA> <NA> NA west
5166 13.963 183.4  62.50 18.58   NA <NA> <NA> NA west
5213 14.058 177.2  48.00 15.28   NA <NA> <NA> NA west
5273 14.176 169.8  61.70 21.39 56.0 <NA> <NA> NA west
5330 14.302 173.5  65.80 21.85   NA <NA> <NA> NA west
5362 14.398 185.6  81.00 23.51 56.6 <NA> <NA> NA west
5363 14.398 156.1  43.20 17.72 56.0 <NA> <NA> NA west
5367 14.412 165.5  54.20 19.78 56.5   G4   P4 15 west
5410 14.527 160.7  52.00 20.13 65.0   G5   P5 12 west
5415 14.540 182.4  76.00 22.84 58.2   G5   P6 20 west
5417 14.543 176.4  51.00 16.38 55.9   G5   P4  8 west
5474 14.666 163.2  49.00 18.39 55.0 <NA> <NA> NA west
5529 14.830 160.0  43.00 16.79   NA <NA> <NA> NA west
5542 14.852 180.9  60.20 18.39 57.3 <NA> <NA> NA west
5551 14.863 169.1  66.30 23.18 55.5   G5   P6 20 west
5553 14.872 175.8  61.70 19.96 56.0 <NA> <NA> NA west
5557 14.885 175.3  56.00 18.22 57.6 <NA> <NA> NA west
5608 15.017 180.6  58.00 17.78 55.0 <NA> <NA> NA west
5619 15.055 172.5  64.00 21.50   NA <NA> <NA> NA west
5671 15.159 166.5  53.80 19.40 56.5 <NA> <NA> NA west
5710 15.249 188.0  89.00 25.18 56.0   G5   P5 20 west
5746 15.359 179.2  61.70 19.21 55.2 <NA> <NA> NA west
5756 15.394 179.8  67.30 20.81 54.4 <NA> <NA> NA west
5777 15.449 165.1  45.00 16.50 55.6 <NA> <NA> NA west
5784 15.466 179.7  64.00 19.81   NA <NA> <NA> NA west
5800 15.493 182.5 102.00 30.62 57.7 <NA> <NA> NA west
5835 15.570 177.3  63.50 20.20 57.5 <NA> <NA> NA west
5880 15.668 176.0  63.80 20.59 59.8   G5   P6 20 west
5893 15.693 167.2  59.00 21.10   NA <NA> <NA> NA west
5957 15.879 190.7  73.80 20.29 56.9 <NA> <NA> NA west
5989 15.937 185.2  56.80 16.56 56.6 <NA> <NA> NA west
6030 16.049 177.9  61.40 19.40 54.6 <NA> <NA> NA west
6033 16.062 183.9  61.50 18.18 56.0   G5   P6 15 west
6037 16.068 186.5  70.70 20.32 57.0   G3   P5 25 west
6085 16.246 183.5  76.00 22.57 55.9   G5   P5 25 west
6092 16.273 177.9  57.00 18.01 56.0   G5   P5 25 west
6119 16.364 183.1  60.40 18.01 56.5 <NA> <NA> NA west
6182 16.536 185.3  83.00 24.17 56.8 <NA> <NA> NA west
6185 16.544 178.0  65.70 20.73 58.0   G5   P6 15 west
6251 16.717 180.2  66.40 20.44 56.4   G5   P5 20 west
6283 16.807 184.3  77.00 22.66 60.3   G5   P6 20 west
6315 16.903 176.7  65.50 20.97   NA <NA> <NA> NA west
6333 16.939 173.4  54.60 18.15 56.0 <NA> <NA> NA west
6343 16.966 182.4  63.70 19.14 56.1   G5   P6 15 west
6416 17.117 183.2  69.30 20.64 56.5   G5   P6 25 west
6523 17.429 195.5  74.50 19.49 58.0 <NA> <NA> NA west
6525 17.431 182.0  65.00 19.62 56.5 <NA> <NA> NA west
6534 17.464 185.0  67.00 19.57 59.5 <NA> <NA> NA west
6562 17.544 164.3  64.00 23.70 56.5 <NA> <NA> NA west
6567 17.560 186.5  71.20 20.47 57.3   G5   P6 20 west
6568 17.560 164.3  56.40 20.89 57.0 <NA> <NA> NA west
6641 17.749 174.0  94.90 31.34 56.3   G5   P5 25 west
6647 17.757 196.2  81.00 21.04 59.7   G5   P6 15 west
6693 17.938 180.3  83.00 25.53 55.4 <NA> <NA> NA west
6700 17.957 172.2  64.50 21.75 56.3   G5   P6 20 west
6754 18.119 187.6  70.60 20.06 58.3 <NA> <NA> NA west
6829 18.343 180.7  80.00 24.50 60.3 <NA> <NA> NA west
6857 18.450 181.7  62.50 18.93   NA <NA> <NA> NA west
6862 18.458 178.2  62.40 19.65 58.5 <NA> <NA> NA west
6965 18.743 179.6  61.00 18.91 55.9 <NA> <NA> NA west
6975 18.765 180.2  74.60 22.97   NA <NA> <NA> NA west
6977 18.773 189.4  69.60 19.40 55.6   G5   P6 20 west
6981 18.792 174.8  56.00 18.32 53.9   G5   P6 15 west
7064 19.052 180.5  77.00 23.63 57.0 <NA> <NA> NA west
7073 19.077 182.7  70.00 20.97 56.4   G5   P5 25 west
7101 19.148 186.5  71.90 20.67 59.5   G5   P5 20 west
7122 19.227 180.0  66.60 20.55 55.5 <NA> <NA> NA west
7152 19.367 178.0  78.10 24.64 57.2   G5   P5 25 west
7160 19.405 173.8  59.30 19.63 55.8 <NA> <NA> NA west
7187 19.512 182.5  63.00 18.91 56.5 <NA> <NA> NA west
7192 19.526    NA     NA    NA   NA   G5   P6 20 west
7200 19.575 195.0  88.90 23.37 60.0   G5   P6 25 west
7218 19.630 181.0  65.10 19.87   NA <NA> <NA> NA west
7221 19.633 182.1  75.00 22.61 57.4   G5   P6 25 west
7247 19.739 177.0  65.50 20.90 56.1   G5   P6 20 west
7278 19.868 186.0  76.10 21.99   NA <NA> <NA> NA west
7308 19.978 189.5  88.10 24.53   NA <NA> <NA> NA west
7410 20.372 188.7  59.80 16.79 55.2 <NA> <NA> NA west
7444 20.761 189.1  88.00 24.60   NA <NA> <NA> NA west
7447 20.780 193.5  75.40 20.13   NA <NA> <NA> NA west

$south
        age   hgt    wgt   bmi   hc  gen  phb tv   reg
3     0.035  50.1  3.650 14.54 33.7 <NA> <NA> NA south
4     0.038  53.5  3.370 11.77 35.0 <NA> <NA> NA south
18    0.057  50.0  3.140 12.56 35.2 <NA> <NA> NA south
23    0.060  54.5  4.270 14.37 36.7 <NA> <NA> NA south
28    0.062  57.5  5.030 15.21 37.3 <NA> <NA> NA south
36    0.068  55.5  4.655 15.11 37.0 <NA> <NA> NA south
37    0.068  52.5  3.810 13.82 34.9 <NA> <NA> NA south
93    0.084  54.0  4.500 15.43 37.0 <NA> <NA> NA south
177   0.104  54.0  4.230 14.50 38.1 <NA> <NA> NA south
201   0.117  54.9  4.450 14.76 37.5 <NA> <NA> NA south
217   0.128  58.0  5.000 14.86 38.8 <NA> <NA> NA south
272   0.153  60.6  5.475 14.90 40.3 <NA> <NA> NA south
273   0.153  64.0  6.260 15.28 49.2 <NA> <NA> NA south
349   0.172  57.5  4.660 14.09 36.2 <NA> <NA> NA south
418   0.191  56.9  4.140 12.78 35.7 <NA> <NA> NA south
431   0.197  58.0  4.550 13.52 38.3 <NA> <NA> NA south
432   0.197  56.0  4.480 14.28 38.3 <NA> <NA> NA south
511   0.249  65.5  7.155 16.67 42.4 <NA> <NA> NA south
525   0.251  57.6  5.870 17.69 42.2 <NA> <NA> NA south
625   0.276  64.0  6.740 16.45 40.9 <NA> <NA> NA south
693   0.380  64.0  7.100 17.33 43.0 <NA> <NA> NA south
698   0.396  67.5  7.930 17.40 42.3 <NA> <NA> NA south
705   0.402  62.1  6.200 16.07 42.2 <NA> <NA> NA south
745   0.465  64.5  6.950 16.70 43.4 <NA> <NA> NA south
800   0.511  69.0  7.275 15.28 43.0 <NA> <NA> NA south
849   0.547  71.0  9.210 18.27 43.1 <NA> <NA> NA south
929   0.706  73.0  9.300 17.45 46.3 <NA> <NA> NA south
948   0.731  70.9  8.620 17.14 46.4 <NA> <NA> NA south
1042  0.783  72.0  8.800 16.97 47.5 <NA> <NA> NA south
1118  0.911  76.7 10.330 17.55 45.4 <NA> <NA> NA south
1121  0.917  72.8  9.130 17.22 45.7 <NA> <NA> NA south
1128  0.925  78.0  9.660 15.87 46.2 <NA> <NA> NA south
1154  0.950  76.0 10.620 18.38 48.2 <NA> <NA> NA south
1209  0.991  72.5 10.220 19.44 47.5 <NA> <NA> NA south
1331  1.138  77.0  8.880 14.97 46.0 <NA> <NA> NA south
1377  1.190  78.5  9.840 15.96 46.2 <NA> <NA> NA south
1384  1.193  76.5  9.960 17.01 47.0 <NA> <NA> NA south
1395  1.201  75.2 10.090 17.84 49.0 <NA> <NA> NA south
1398  1.204  77.5 11.700 19.47 46.6 <NA> <NA> NA south
1423  1.215  78.5 10.085 16.36 47.0 <NA> <NA> NA south
1450  1.237  77.5  9.230 15.36 46.5 <NA> <NA> NA south
1456  1.240  83.0 12.500 18.14 48.5 <NA> <NA> NA south
1466  1.242  76.2 10.280 17.70 47.5 <NA> <NA> NA south
1526  1.311  78.5 10.230 16.60 45.6 <NA> <NA> NA south
1579  1.426  77.2  9.800 16.44 46.0 <NA> <NA> NA south
1594  1.456  87.5 13.000 16.97 48.1 <NA> <NA> NA south
1756  1.598  85.5 12.700 17.37 49.2 <NA> <NA> NA south
1873  1.785  86.5 12.300 16.43 47.1 <NA> <NA> NA south
1887  1.801  85.3 11.450 15.73 49.0 <NA> <NA> NA south
1916  1.839    NA 12.300    NA 49.3 <NA> <NA> NA south
1926  1.845  88.0 12.200 15.75 48.8 <NA> <NA> NA south
1936  1.859  85.5 12.800 17.50 48.5 <NA> <NA> NA south
1940  1.867    NA 15.000    NA 49.3 <NA> <NA> NA south
1969  1.911    NA 12.400    NA 50.2 <NA> <NA> NA south
2000  1.973    NA 12.300    NA 48.9 <NA> <NA> NA south
2008  1.979  89.0 12.100 15.27 49.5 <NA> <NA> NA south
2009  1.979    NA 11.600    NA 49.1 <NA> <NA> NA south
2076  2.050  85.7 12.700 17.29 46.3 <NA> <NA> NA south
2113  2.113  94.2 14.800 16.67 49.8 <NA> <NA> NA south
2139  2.157  93.9 13.700 15.53 47.6 <NA> <NA> NA south
2144  2.168  91.2 11.500 13.82 48.6 <NA> <NA> NA south
2154  2.182  88.4 14.000 17.91 51.1 <NA> <NA> NA south
2180  2.228  91.6 12.800 15.25 47.6 <NA> <NA> NA south
2220  2.381  91.5 11.500 13.73 49.9 <NA> <NA> NA south
2227  2.412  96.2 15.000 16.20 49.9 <NA> <NA> NA south
2249  2.458  92.0 15.600 18.43 51.0 <NA> <NA> NA south
2277  2.502  91.1 15.800 19.03 49.7 <NA> <NA> NA south
2355  2.672  97.9 16.100 16.79 51.6 <NA> <NA> NA south
2418  2.872  95.2 14.400 15.88 49.8 <NA> <NA> NA south
2419  2.874  95.0 14.000 15.51 49.5 <NA> <NA> NA south
2436  2.913  96.5 14.500 15.57 46.2 <NA> <NA> NA south
2519  3.041  94.5 14.700 16.46 52.0 <NA> <NA> NA south
2522  3.047  94.0 13.800 15.61 49.3 <NA> <NA> NA south
2532  3.058  97.8 14.200 14.84 47.9 <NA> <NA> NA south
2552  3.093  98.7 14.800 15.19 50.7 <NA> <NA> NA south
2553  3.093  99.6 15.300 15.42 52.2 <NA> <NA> NA south
2561  3.107 103.5 16.200 15.12 52.0 <NA> <NA> NA south
2565  3.115 100.0 16.000 16.00 49.7 <NA> <NA> NA south
2748  3.912 108.1 21.000 17.97 52.5 <NA> <NA> NA south
2777  3.964 104.7 14.600 13.31 47.6 <NA> <NA> NA south
2792  4.013 105.2 16.000 14.45 49.0 <NA> <NA> NA south
2811  4.046 104.5 16.350 14.97 50.0 <NA> <NA> NA south
2841  4.295 106.5 17.200 15.16 49.2 <NA> <NA> NA south
2850  4.383 106.3 19.600 17.34 50.3 <NA> <NA> NA south
2885  5.292 109.3 19.600 16.40 52.4 <NA> <NA> NA south
2891  5.333 112.9 19.100 14.98 53.0 <NA> <NA> NA south
3005  6.017 128.1 27.800 16.94 53.3 <NA> <NA> NA south
3107  7.394 135.9 45.300 24.52 52.8 <NA> <NA> NA south
3111  7.419 131.0 27.500 16.02 53.8 <NA> <NA> NA south
3179  7.709 132.0 34.600 19.85 55.2 <NA> <NA> NA south
3283  8.867 145.0 38.200 18.16 54.8   G2   P1  2 south
3296  8.908 137.8 30.000 15.79 54.7   G1   P1  2 south
3300  8.925 140.2 37.200 18.92 55.9 <NA> <NA> NA south
3315  8.977 123.0 24.900 16.45 53.8 <NA> <NA> NA south
3329  9.021 132.7 30.000 17.03 53.6 <NA> <NA> NA south
3398  9.234 139.8 35.600 18.21 53.0   G2   P1  2 south
3422  9.316 147.4 31.400 14.45 53.2   G1   P1  2 south
3429  9.368 132.7 25.900 14.70 52.7   G1   P1  2 south
3442  9.407 134.4 27.000 14.94 51.8   G1   P1  6 south
3525  9.582 134.0 27.500 15.31 51.0   G2   P2  3 south
3547  9.631 139.7 28.700 14.70 53.0   G2   P1  2 south
3596  9.768 148.2 32.800 14.93 55.8   G2   P1 NA south
3664 10.020 137.2 31.700 16.84 54.1   G1   P1  2 south
3710 10.132 134.0 26.500 14.75 50.1   G1   P1  3 south
3721 10.154 139.3 30.600 15.76 51.5   G2   P2  2 south
3841 10.499 148.6 38.600 17.48 54.4   G2   P1  4 south
3922 10.713 137.4 26.100 13.82 50.8   G1   P1 NA south
3924 10.715 135.2 26.500 14.49 51.3 <NA> <NA> NA south
3975 10.888 147.0 33.800 15.64 54.0   G1   P1  1 south
4024 11.052 150.7 34.100 15.01 52.7 <NA> <NA> NA south
4067 11.143 148.3 41.500 18.86 53.7   G2   P2 10 south
4070 11.156 163.0 44.500 16.74 57.8   G2   P2  4 south
4072 11.159 144.5 49.700 23.80 57.2   G1   P1  4 south
4238 11.605 155.2 36.700 15.23 54.6   G2   P1  6 south
4301 11.789 135.4 29.700 16.20 54.2   G2   P2  5 south
4505 12.375 157.2 61.000 24.68 54.0   G1   P1  3 south
4552 12.501 170.5 53.400 18.36 56.4   G2   P1  6 south
4585 12.583 163.3 52.600 19.72 56.2   G3   P3 15 south
4646 12.741 172.0 79.500 26.87 55.0   G2   P3  8 south
4658 12.766 167.2 50.600 18.09 56.9 <NA> <NA> NA south
4721 12.933 169.5 54.800 19.07 55.7   G3   P3  8 south
4727 12.944 157.0 41.200 16.71 55.0   G2   P2  8 south
4770 13.021 172.4 55.800 18.77 57.3 <NA> <NA> NA south
4845 13.182 161.5 45.000 17.25 55.2 <NA> <NA> NA south
4959 13.486 156.2 46.600 19.09 55.6   G2   P2 NA south
5044 13.642 153.4 40.600 17.25 55.6   G3   P2  4 south
5126 13.883 176.2 48.100 15.49 53.9   G3   P4  8 south
5131 13.894 161.9 47.300 18.04 51.2 <NA> <NA> NA south
5206 14.045 170.0 54.700 18.92 52.7   G3   P4 10 south
5247 14.121 159.2 42.700 16.84 53.6   G3   P3 10 south
5288 14.209 170.9 54.800 18.76 53.9   G4   P4 15 south
5335 14.308 171.0 56.000 19.15 53.1   G4   P4 20 south
5369 14.414 166.5 53.900 19.44 56.7 <NA> <NA> NA south
5420 14.546 167.3 61.000 21.79 55.0   G4   P2 10 south
5478 14.669 182.9 57.200 17.09 56.7   G5   P5 20 south
5496 14.721 169.1 50.700 17.73 56.3   G4   P3 10 south
5516 14.787 168.5 54.200 19.08 55.0 <NA> <NA> NA south
5534 14.839 178.5 49.900 15.66 56.6 <NA> <NA> NA south
5539 14.844 172.7 64.300 21.55 56.0   G4   P4 12 south
5567 14.926 177.4 58.300 18.52 56.0   G5   P5 20 south
5602 15.003 188.0 91.600 25.91 59.8   G4   P4 12 south
5612 15.028 178.5 54.100 16.97 57.0   G5   P4 20 south
5654 15.129 176.9 58.600 18.72 54.3   G4   P5 20 south
5656 15.132 167.3 59.800 21.36 54.0 <NA> <NA> NA south
5687 15.189 165.2 45.400 16.63 54.7 <NA> <NA> NA south
5806 15.504 172.0 52.300 17.67 56.9   G5   P5 20 south
5857 15.630 174.3 52.600 17.31 54.8   G3   P3 15 south
5875 15.657 178.8 71.100 22.23 59.1 <NA> <NA> NA south
5879 15.663 172.7 58.600 19.64 57.1   G5   P6 15 south
5883 15.674 176.6 56.900 18.24 56.8   G5   P5 25 south
5930 15.794 168.9 49.100 17.21 55.2 <NA> <NA> NA south
5947 15.838 177.0 63.500 20.26 58.0   G5   P6 25 south
5963 15.890 173.8 52.100 17.24 58.4 <NA> <NA> NA south
5996 15.961 183.0 73.000 21.79 57.0 <NA> <NA> NA south
6036 16.065 184.4 68.500 20.14 55.3   G5   P5 15 south
6040 16.090 180.6 90.400 27.71 58.8 <NA> <NA> NA south
6047 16.109 170.8 48.500 16.62 54.7 <NA> <NA> NA south
6083 16.235 185.4 60.000 17.45 54.3   G5   P5 20 south
6089 16.262 176.8 57.800 18.49   NA <NA> <NA> NA south
6117 16.355 171.0 59.100 20.21 58.0   G4   P4 20 south
6132 16.402 173.6 54.500 18.08 54.0   G5   P4 25 south
6138 16.427 195.5 69.000 18.05 56.0   G5   P6 25 south
6141 16.435 175.1 64.500 21.03 58.2   G5   P5 25 south
6166 16.492 188.0 62.500 17.68 57.5   G5   P5 25 south
6214 16.616 170.5 47.600 16.37 55.2 <NA> <NA> NA south
6372 17.018 183.0 65.500 19.55 57.4   G5   P5 20 south
6386 17.056 184.0 67.000 19.78 58.5 <NA> <NA> NA south
6439 17.204 177.5 66.000 20.94 50.5 <NA> <NA> NA south
6566 17.555 190.0 97.000 26.86 59.0 <NA> <NA> NA south
6583 17.604 168.6 64.500 22.69 57.7 <NA> <NA> NA south
6614 17.681 164.9 64.100 23.57 60.2 <NA> <NA> NA south
6724 18.039 187.1 87.700 25.05 58.8 <NA> <NA> NA south
6744 18.094 170.0 51.000 17.64 57.0 <NA> <NA> NA south
6756 18.121 171.6 58.400 19.83 57.1   G5   P6 15 south
6782 18.209 177.9 63.400 20.03 58.2   G4   P5 15 south
6789 18.220 187.4 79.000 22.49 58.4   G5   P6 25 south
6886 18.532 186.5 67.300 19.34 58.2 <NA> <NA> NA south
6892 18.551 193.0 71.700 19.24 56.9   G4   P6 12 south
6916 18.606 172.4 57.200 19.24 55.2 <NA> <NA> NA south
6923 18.617 188.0 61.900 17.51 55.5   G5   P5 20 south
6961 18.735 191.5 79.000 21.54 59.0 <NA> <NA> NA south
6973 18.762 181.5 79.900 24.25 56.9 <NA> <NA> NA south
7062 19.049 176.9 65.200 20.83 54.9 <NA> <NA> NA south
7066 19.060 180.8 93.800 28.69 55.8   G4   P5 15 south
7068 19.063 175.0 72.400 23.64 58.0   G5   P6 25 south
7088 19.104 186.1 72.500 20.93 58.7 <NA> <NA> NA south
7132 19.282 185.2 71.500 20.84 57.8 <NA> <NA> NA south
7135 19.290 180.9 94.400 28.84 58.6 <NA> <NA> NA south
7141 19.310 177.1 60.100 19.16 57.8   G5   P6 20 south
7257 19.778 178.2 62.500 19.68 57.8 <NA> <NA> NA south
7396 20.281 185.1 81.100 23.67 58.8   G5   P6 20 south

$city
        age   hgt    wgt   bmi   hc  gen  phb tv  reg
62    0.079  58.5  5.745 16.78 38.5 <NA> <NA> NA city
251   0.147  57.0  5.170 15.91 40.2 <NA> <NA> NA city
328   0.169  60.3  4.990 13.72 40.3 <NA> <NA> NA city
337   0.172  58.5  5.080 14.84   NA <NA> <NA> NA city
520   0.251  60.0  6.495 18.04 38.5 <NA> <NA> NA city
532   0.254  65.5  6.720 15.66   NA <NA> <NA> NA city
552   0.257  59.0  6.260 17.98   NA <NA> <NA> NA city
608   0.271  60.5  6.360 17.37 41.6 <NA> <NA> NA city
648   0.295  61.5  6.540 17.29 43.0 <NA> <NA> NA city
666   0.334  66.5  7.190 16.25 43.6 <NA> <NA> NA city
706   0.405  64.0  7.950 19.40 44.0 <NA> <NA> NA city
850   0.550  67.0  7.215 16.07 43.4 <NA> <NA> NA city
851   0.555  67.1  7.785 17.29 44.5 <NA> <NA> NA city
919   0.689  73.0  7.740 14.52   NA <NA> <NA> NA city
926   0.703  68.0  7.660 16.56 44.1 <NA> <NA> NA city
934   0.714  76.1 10.910 18.83 45.0 <NA> <NA> NA city
1031  0.777  75.0  9.425 16.75 45.5 <NA> <NA> NA city
1046  0.788  76.0 10.016 17.34   NA <NA> <NA> NA city
1188  0.977  78.3 11.040 18.00 47.0 <NA> <NA> NA city
1299  1.062  74.1 10.560 19.23   NA <NA> <NA> NA city
1363  1.182  78.6  9.410 15.23   NA <NA> <NA> NA city
1409  1.210  78.0 10.150 16.68   NA <NA> <NA> NA city
1463  1.242  76.0 10.100 17.48 46.6 <NA> <NA> NA city
1524  1.308  81.0 10.410 15.86 47.3 <NA> <NA> NA city
1805  1.697    NA 14.400    NA 48.1 <NA> <NA> NA city
1932  1.853  90.2 14.050 17.26 51.5 <NA> <NA> NA city
1979  1.938    NA 15.200    NA   NA <NA> <NA> NA city
2104  2.088  90.2 13.000 15.97 48.5 <NA> <NA> NA city
2555  3.099  99.5 13.600 13.73 49.6 <NA> <NA> NA city
2632  3.433  99.0 17.000 17.34 53.0 <NA> <NA> NA city
2713  3.827  99.0 14.500 14.79 52.3 <NA> <NA> NA city
2892  5.338 123.0 28.100 18.57 52.5 <NA> <NA> NA city
3075  7.041 130.5 27.000 15.85 55.0 <NA> <NA> NA city
3161  7.600 148.5 35.500 16.09 53.6 <NA> <NA> NA city
3196  7.849 132.1 42.500 24.35 55.2   G1   P1 NA city
3334  9.034 139.6 33.800 17.34 55.0 <NA> <NA> NA city
3722 10.157 153.1 40.800 17.40 56.1   G1   P1 NA city
3724 10.160 141.3 39.500 19.78 52.4   G1   P1  4 city
3834 10.477 142.6 32.500 15.98 53.0   G2   P2  4 city
3988 10.945 149.0 45.600 20.53 56.1   G1   P2  4 city
4211 11.545 153.2 42.500 18.10 54.4   G1   P1  2 city
4255 11.665 144.5 30.800 14.75 51.5   G2   P2  5 city
4293 11.759 153.2 42.800 18.23 53.4   G2   P2  2 city
4302 11.791 152.8 43.500 18.63 54.2   G2   P1  3 city
4561 12.520 162.1 44.100 16.78 52.6   G2   P3  4 city
4612 12.646 169.6 58.200 20.23 56.6 <NA> <NA> NA city
4752 12.996 158.9 49.100 19.44 59.0   G3   P3 10 city
4887 13.275 161.2 37.000 14.23 53.6   G3   P2  4 city
4894 13.300 147.2 32.500 14.99 52.1   G2   P2 NA city
4966 13.500 161.5 60.200 23.08 56.1   G2   P2 NA city
4997 13.555 159.1 45.500 17.97 54.5   G4   P4 NA city
5012 13.577 176.7 60.500 19.37 54.9 <NA> <NA> NA city
5047 13.656 146.9 34.700 16.07 55.1   G2   P3 NA city
5048 13.656 175.4 74.800 24.31 59.2   G4   P5 20 city
5234 14.094 169.1 54.000 18.88 55.5 <NA> <NA> NA city
5327 14.297 153.2 44.300 18.87 50.5   G2   P3 12 city
5343 14.332 164.1 49.100 18.23 54.9   G2   P2  8 city
5463 14.650 179.8 71.100 21.99 58.0   G5   P5 NA city
5731 15.304 175.0 63.500 20.73 58.0   G4   P4 NA city
5813 15.520 162.5 51.500 19.50 54.5   G2   P2 NA city
5856 15.622 184.1 70.500 20.80 58.4   G4   P4 15 city
5964 15.893 168.6 56.000 19.70 54.9   G4   P5 15 city
5971 15.906 176.2 57.500 18.52 57.6   G4   P5 12 city
6090 16.268 170.0 66.000 22.83 57.0   G5   P5 NA city
6144 16.443 190.5 60.500 16.67 57.0   G5   P5 NA city
6187 16.547 181.8 72.000 21.78 57.2   G5   P5 NA city
6253 16.720 192.8 88.300 23.75 57.8   G5   P5 25 city
6375 17.023 165.2 56.200 20.59 56.7 <NA> <NA> NA city
6573 17.579 181.6 66.000 20.01 55.5 <NA> <NA> NA city
6858 18.453 170.5 53.500 18.40 55.3   G5   P5 20 city
7001 18.850 179.8 62.600 19.36 55.9   G4   P6 25 city
7300 19.942 193.1 93.600 25.10 60.0   G5   P5 NA city
7328 20.030 178.6 71.000 22.25 57.2   G5   P5 25 city

map()

The map() function applies a function to each element of a list or vector and always returns a list of the same length. It can be used for calculations, transformations, or data extraction.

out <- boys %>% 
  split(.$reg) %>% # split the data by region
  map(~ lm(hgt ~ age, data = .x) %>% # map the linear model function
        coef()) # extract coefficients

is.list(out)
[1] TRUE
names(out)
[1] "north" "east"  "west"  "south" "city" 
out$city
(Intercept)         age 
  69.010565    6.724608 
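
When only a single number per group is needed, a typed variant such as map_dbl() returns a named numeric vector instead of a list. The sketch below follows the same split-and-map pattern. It uses the built-in mtcars data as a stand-in for boys; that swap is an assumption made only to keep the example self-contained.

```r
library(purrr)
library(magrittr)

# mtcars stands in for the boys data (assumption, to avoid extra packages)
slopes <- mtcars %>%
  split(.$cyl) %>%                      # one data frame per cylinder count
  map(~ lm(mpg ~ wt, data = .x)) %>%    # list of fitted models, one per group
  map_dbl(~ coef(.x)[["wt"]])           # keep only the slope, as a numeric vector

slopes
```

The names of the result are the group labels created by split(), so a slope can be looked up by group, for example slopes[["4"]].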

map() on large lists

In the example below, we take a bootstrap sample (with replacement) from the boys data 1000 times, and then run a simple linear model on each of the 1000 bootstrap samples separately.

sample_rows <- function(x) {
  out <- x[sample(1:nrow(x), replace = TRUE), ] 
  return(out)
}
samples <- replicate(n = 1000, expr = sample_rows(boys), simplify = FALSE)
samples_lm <- 
  samples %>% 
  map(~.x %$% 
        lm(hgt ~ age) %>% 
        coef())
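
The exposition pipe %$% from magrittr, used in the code above, exposes the columns of its left-hand side to the expression on its right, much like with(). A minimal sketch of that equivalence, again with mtcars as an assumed stand-in for a single bootstrap sample:

```r
library(magrittr)

# mtcars is a stand-in data set, not part of the original slides
fit_exposed <- mtcars %$% lm(mpg ~ wt)       # columns are exposed to lm()
fit_data    <- lm(mpg ~ wt, data = mtcars)   # the usual data argument
fit_with    <- with(mtcars, lm(mpg ~ wt))    # base R equivalent

all.equal(coef(fit_exposed), coef(fit_data))
all.equal(coef(fit_exposed), coef(fit_with))
```

All three calls estimate the same model; %$% is mainly convenient for functions that have no data argument.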

map() on large lists

# how many samples?
length(samples)
[1] 1000
# what is the first sample?
samples[[1]] %>% slice_head()
        age   hgt  wgt   bmi hc  gen  phb tv  reg
4501 12.342 158.1 54.9 21.96 NA <NA> <NA> NA west
# what is the first sample's linear model?
samples_lm[[1]]
(Intercept)         age 
  70.975166    6.573117 
# what are the first three samples' linear models?
samples_lm[1:3]
[[1]]
(Intercept)         age 
  70.975166    6.573117 

[[2]]
(Intercept)         age 
  69.902756    6.687102 

[[3]]
(Intercept)         age 
  70.595459    6.583677 
# how many linear models in total?
length(samples_lm)
[1] 1000

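reduce()

reduce() combines the elements of a list into a single result by applying a binary function repeatedly. That makes it a natural way to aggregate the results obtained with map().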

reduce(samples_lm, `+`) # sum of lm coefficients
(Intercept)         age 
  70681.662    6594.446 
reduce(samples_lm, `+`) / 1000 # average lm coefficients
(Intercept)         age 
  70.681662    6.594446 
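
To see what reduce() does step by step, it can help to run it on a list that is small enough to check by hand. The sketch below uses three made-up coefficient vectors (hypothetical values, not taken from the boys data) and compares reduce() with accumulate(), which keeps the intermediate results.

```r
library(purrr)

# three hypothetical coefficient vectors
coefs <- list(c(a = 1, b = 10), c(a = 2, b = 20), c(a = 3, b = 30))

total <- reduce(coefs, `+`)       # (coefs[[1]] + coefs[[2]]) + coefs[[3]]
avg   <- total / length(coefs)    # divide by the list length, not a fixed number
steps <- accumulate(coefs, `+`)   # running sums after each element

total
avg
steps
```

Dividing by length(coefs) rather than by a hard-coded 1000 keeps the average correct when the number of samples changes.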

structuring the output of map()

map() returns a list. To get a data frame instead, use map_df(), which binds the result for each element row-wise into one tibble. This is useful when you want the results of map() in a tidy format.

# with map_df() instead of map() to return a data frame
samples %>% 
  map_df(~.x %$%  
           lm(hgt ~ age) %>% 
           coef())
# A tibble: 1,000 × 2
   `(Intercept)`   age
           <dbl> <dbl>
 1          71.0  6.57
 2          69.9  6.69
 3          70.6  6.58
 4          70.8  6.56
 5          70.2  6.56
 6          70.2  6.67
 7          70.8  6.67
 8          72.2  6.48
 9          69.7  6.71
10          70.9  6.57
# ℹ 990 more rows

structuring the output of map()

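# with map_df() if an object already exists as a list
# careful: piping .x into tibble() also passes .x as the first (unnamed) column,
# which yields two rows per model and an extra column named `.`
samples_lm %>% 
  map_df(~.x %>% tibble(intercept = .x[1], slope = .x[2]))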
# A tibble: 2,000 × 3
       . intercept slope
   <dbl>     <dbl> <dbl>
 1 71.0       71.0  6.57
 2  6.57      71.0  6.57
 3 69.9       69.9  6.69
 4  6.69      69.9  6.69
 5 70.6       70.6  6.58
 6  6.58      70.6  6.58
 7 70.8       70.8  6.56
 8  6.56      70.8  6.56
 9 70.2       70.2  6.56
10  6.56      70.2  6.56
# ℹ 1,990 more rows
# calling tibble() directly gives one row per model
map_df(samples_lm, ~tibble(intercept = .x[1], slope = .x[2]))
# A tibble: 1,000 × 2
   intercept slope
       <dbl> <dbl>
 1      71.0  6.57
 2      69.9  6.69
 3      70.6  6.58
 4      70.8  6.56
 5      70.2  6.56
 6      70.2  6.67
 7      70.8  6.67
 8      72.2  6.48
 9      69.7  6.71
10      70.9  6.57
# ℹ 990 more rows
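
In purrr 1.0.0 and later, map_df() is superseded: the recommended pattern is map() followed by list_rbind(). A minimal sketch, assuming purrr >= 1.0.0 and using two made-up coefficient vectors (hypothetical values) in place of samples_lm:

```r
library(purrr)
library(tibble)

# two hypothetical coefficient vectors, shaped like the output of coef()
fits <- list(
  c(`(Intercept)` = 70, age = 6.5),
  c(`(Intercept)` = 71, age = 6.6)
)

tidy <- fits |>
  map(\(b) tibble(intercept = b[[1]], slope = b[[2]])) |>  # one-row tibble per model
  list_rbind()                                             # stack the rows

tidy
```

The lambda shorthand \(b) and the native pipe |> require R 4.1 or later; the formula style (~ .x) used elsewhere in these slides works as well.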

futures

  • The future package enables asynchronous and parallel processing in R.
  • It allows R to perform tasks in the background, freeing up your current R session.
  • Ideal for:
    • Speeding up long-running computations
    • Running tasks concurrently

Why Use future?

  • Normally, R runs code line-by-line (sequentially).
  • future lets you run tasks in parallel, improving efficiency.
  • Example use cases:
    • Simulations
    • Data processing across multiple cores
    • Web scraping multiple pages
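
The idea of a background task can be sketched with the future package directly. The example assumes that two background R sessions can be started on the machine; Sys.sleep() is only a stand-in for a slow computation.

```r
library(future)

plan(multisession, workers = 2)   # two background R sessions

f <- future({
  Sys.sleep(1)                    # stand-in for a slow computation
  sum(1:10)
})

# the current session is not blocked while the future is being resolved
message("still free to work here")

result <- value(f)                # waits for the background task if needed
result

plan(sequential)                  # reset to sequential processing
```

future() starts the task, and value() collects the result, blocking only if the task has not finished yet.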

future_map()

The future_map() function is part of the furrr package, which integrates the future package with the purrr package’s mapping functions.

It allows you to apply a function to each element of a list or vector in parallel, making it easier to handle large datasets or computationally intensive tasks.

# Set up parallel processing
plan(multisession) # Use multiple cores for parallel processing
samples %>% 
  future_map_dfr(~.x %$% 
           lm(hgt ~ age) %>% 
           coef())
# A tibble: 1,000 × 2
   `(Intercept)`   age
           <dbl> <dbl>
 1          71.0  6.57
 2          69.9  6.69
 3          70.6  6.58
 4          70.8  6.56
 5          70.2  6.56
 6          70.2  6.67
 7          70.8  6.67
 8          72.2  6.48
 9          69.7  6.71
10          70.9  6.57
# ℹ 990 more rows
plan(sequential) # Stop parallel processing and reset to sequential processing

future_map()

future_map_dfc() works like future_map_dfr(), but binds the results column-wise instead of row-wise: each bootstrap sample becomes one column, with the intercept in row 1 and the slope in row 2.

# Set up parallel processing
plan(multisession) # Use multiple cores for parallel processing
samples %>% 
  future_map_dfc(~.x %$% 
           lm(hgt ~ age) %>% 
           coef())
# A tibble: 2 × 1,000
   ...1  ...2  ...3  ...4  ...5  ...6  ...7  ...8  ...9 ...10 ...11 ...12 ...13
  <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
1 71.0  69.9  70.6  70.8  70.2  70.2  70.8  72.2  69.7  70.9  70.3  70.5  69.5 
2  6.57  6.69  6.58  6.56  6.56  6.67  6.67  6.48  6.71  6.57  6.61  6.55  6.68
# ℹ 987 more variables: ...14 <dbl>, ...15 <dbl>, ...16 <dbl>, ...17 <dbl>,
#   ...18 <dbl>, ...19 <dbl>, ...20 <dbl>, ...21 <dbl>, ...22 <dbl>,
#   ...23 <dbl>, ...24 <dbl>, ...25 <dbl>, ...26 <dbl>, ...27 <dbl>,
#   ...28 <dbl>, ...29 <dbl>, ...30 <dbl>, ...31 <dbl>, ...32 <dbl>,
#   ...33 <dbl>, ...34 <dbl>, ...35 <dbl>, ...36 <dbl>, ...37 <dbl>,
#   ...38 <dbl>, ...39 <dbl>, ...40 <dbl>, ...41 <dbl>, ...42 <dbl>,
#   ...43 <dbl>, ...44 <dbl>, ...45 <dbl>, ...46 <dbl>, ...47 <dbl>, …
plan(sequential) # Stop parallel processing and reset to sequential processing
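
When the mapped function itself draws random numbers, parallel workers need properly seeded random number streams to give reproducible results. A minimal sketch, assuming furrr's furrr_options(seed = ) argument is used for this; the helper draw_mean() is hypothetical and only serves as an illustration.

```r
library(future)
library(furrr)

plan(multisession, workers = 2)

# hypothetical helper: the mean of 100 standard normal draws
draw_mean <- function(i) mean(rnorm(100))

# the same seed in the options gives the same parallel-safe random streams
run1 <- future_map_dbl(1:4, draw_mean, .options = furrr_options(seed = 123))
run2 <- future_map_dbl(1:4, draw_mean, .options = furrr_options(seed = 123))

plan(sequential)

identical(run1, run2)
```

Without a seed option, furrr warns that random numbers were generated in parallel without a declared seed, so setting it explicitly is the safer habit.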

Binary operators

How binary operators work

Binary operators are functions that take two arguments and return a single value. In R, you can create your own binary operator by giving a function a name of the form %operator%, wrapped in backticks when you define it.

`%my_operator%` <- function(x, y) {
  # perform some operation on x and y
  result <- x + y  # example operation: addition
  return(result)
}

4 %my_operator% 5
[1] 9
1:4 %my_operator% 5:8
[1]  6  8 10 12

Binary operators allow you to write the function in a more natural way, similar to mathematical notation. You can use them for various operations, such as addition, subtraction, multiplication, or any custom operation you define.
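
Two properties are worth knowing when defining your own operators: an infix call is just another way of writing a prefix function call, and all %...% operators bind more tightly than * and /. The sketch below uses two hypothetical operators, %paste% and %plus%, to illustrate both.

```r
# hypothetical operator: paste two strings with a space
`%paste%` <- function(x, y) paste(x, y)

infix   <- "functional" %paste% "programming"
prefix  <- `%paste%`("functional", "programming")  # same call in prefix form
chained <- "a" %paste% "b" %paste% "c"             # evaluated left to right

# hypothetical operator: addition, to show operator precedence
`%plus%` <- function(x, y) x + y
tight <- 2 * 3 %plus% 4    # parsed as 2 * (3 %plus% 4), not (2 * 3) %plus% 4

infix
chained
tight
```

Because of this precedence, it is safest to add parentheses whenever a custom operator is mixed with arithmetic.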

Drawing from distributions

Normal distribution

# Draw 10000 random numbers from a normal distribution
normals <- rnorm(10000, mean = 0, sd = 1)
# Plot the histogram of the random numbers
hist(normals, 
     breaks = 30, 
     main = "Histogram of Random Normal Numbers", 
     xlab = "Value", 
     ylab = "Frequency")

Uniform distribution

# Draw 10000 random numbers from a uniform distribution
uniforms <- runif(10000, min = 0, max = 1)
# Plot the histogram of the random numbers
hist(uniforms, 
     breaks = 80, 
     main = "Histogram of Random Uniform Numbers", 
     xlab = "Value", 
     ylab = "Frequency")
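
Each distribution in R comes with four functions: r for random draws, d for the density, p for the cumulative probability and q for the quantile. A minimal sketch for the normal distribution, comparing a theoretical probability with its estimate from the draws (the seed and the cut-off 1.96 are arbitrary choices):

```r
set.seed(123)
draws <- rnorm(10000, mean = 0, sd = 1)

# sample statistics should be close to the parameters
mean(draws)
sd(draws)

dnorm(0)              # density at 0
pnorm(1.96)           # P(X <= 1.96), theoretical
qnorm(0.975)          # the value below which 97.5% of the mass lies
mean(draws <= 1.96)   # the same probability, estimated from the draws
```

The same naming scheme applies to other distributions, for example runif(), dunif(), punif() and qunif().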

Random number generators

How PRNGs work

Pseudo Random Number Generators (PRNGs) are algorithms that generate sequences of numbers that approximate the properties of random numbers. They are called “pseudo” because they are deterministic and can be reproduced if the initial state (seed) is known.

# fix the seed
set.seed(123)
# draw 10 random integers between 1 and 100 without replacement
sample(1:100, size = 10, replace = FALSE)
 [1] 31 79 51 14 67 42 50 43 97 25
# fix the seed again
set.seed(123)
# draw 10 random integers between 1 and 100 without replacement
sample(1:100, size = 10, replace = FALSE)
 [1] 31 79 51 14 67 42 50 43 97 25
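# draw 10 numbers in two sets of 5: the random stream continues, but each call
# samples without replacement from a fresh 1:100, so 14 can be drawn again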
set.seed(123)
sample(1:100, size = 5, replace = FALSE)
[1] 31 79 51 14 67
sample(1:100, size = 5, replace = FALSE)
[1] 42 50 43 14 25

BEWARE: once you fix the random seed

How PRNGs work

The state of the generator carries over from one call to the next. After fixing the seed, the values returned by rnorm() are therefore the same no matter how the draws are split across calls.

# draw 10 numbers
set.seed(123)
rnorm(10)
 [1] -0.56047565 -0.23017749  1.55870831  0.07050839  0.12928774  1.71506499
 [7]  0.46091621 -1.26506123 -0.68685285 -0.44566197
# draw 10 numbers in sets of 5
set.seed(123)
rnorm(5)
[1] -0.56047565 -0.23017749  1.55870831  0.07050839  0.12928774
rnorm(5)
[1]  1.7150650  0.4609162 -1.2650612 -0.6868529 -0.4456620
# draw 15 numbers in two sets, where the first set is 5 numbers
set.seed(123)
rnorm(5)
[1] -0.56047565 -0.23017749  1.55870831  0.07050839  0.12928774
rnorm(10)
 [1]  1.7150650  0.4609162 -1.2650612 -0.6868529 -0.4456620  1.2240818
 [7]  0.3598138  0.4007715  0.1106827 -0.5558411
# draw 15 numbers in two sets, where the first set is 10 numbers
set.seed(123)
rnorm(10)
 [1] -0.56047565 -0.23017749  1.55870831  0.07050839  0.12928774  1.71506499
 [7]  0.46091621 -1.26506123 -0.68685285 -0.44566197
rnorm(5)
[1]  1.2240818  0.3598138  0.4007715  0.1106827 -0.5558411
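
The behaviour above can be checked programmatically, and the generator state can also be saved and restored by hand. The sketch below assumes the default generator settings and uses assign() on the global environment, which is where R looks up .Random.seed.

```r
# one call of 10 draws versus two calls of 5 draws
set.seed(123)
all_at_once <- rnorm(10)
set.seed(123)
in_parts <- c(rnorm(5), rnorm(5))
identical(all_at_once, in_parts)

# save the generator state in the middle of a stream and restore it later
set.seed(123)
invisible(rnorm(5))                               # advance the stream
state <- get(".Random.seed", envir = globalenv()) # snapshot of the state
a <- rnorm(3)
assign(".Random.seed", state, envir = globalenv())
b <- rnorm(3)                                     # repeats the draws in a
identical(a, b)
```

Restoring the state by hand is rarely needed in an analysis script, but it shows that the seed is nothing more than the starting position in a deterministic stream.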

Replication vs reproduction

Reproduction is the process of running the same analysis with the same data and code to see if exactly the same results are obtained.

# reproduction
set.seed(123)
rnorm(10) %>% mean()
[1] 0.07462564
set.seed(123)
rnorm(10) %>% mean()
[1] 0.07462564

Replication is the process of running the same analysis on a different dataset or in a different context to see if the results are consistent.

# replication WITH reproduction
set.seed(123)
rnorm(10) %>% mean()
[1] 0.07462564
set.seed(124)
rnorm(10) %>% mean() 
[1] 0.2147669
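
The distinction can be made explicit by wrapping the analysis in a function and mapping it over seeds. In this sketch the helper mean_of_draws() is hypothetical: it fixes the seed it is given and returns the mean of n standard normal draws.

```r
library(purrr)

# hypothetical helper: fix the seed, then run the "analysis"
mean_of_draws <- function(seed, n = 10) {
  set.seed(seed)
  mean(rnorm(n))
}

reproduced <- map_dbl(c(123, 123), mean_of_draws)  # same seed twice
replicated <- map_dbl(123:132, mean_of_draws)      # ten different seeds

reproduced
replicated
sd(replicated)   # spread of the result over replications
```

The two reproduced values are identical, while the replicated values vary around zero; their spread shows how much the result depends on the particular draw.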

Practical